Abstract: Chaotic neural networks have received a great deal
of attention in recent years. In this paper we establish a precise
correspondence between the so-called chaotic iterations and a
particular class of artificial neural networks: global recurrent
multi-layer perceptrons. We show formally that it is possible to
make these iterations behave chaotically, as defined by Devaney,
and thus we obtain the first neural networks proven chaotic.
Several neural networks with different architectures are trained
to exhibit chaotic behavior.

I. Introduction 

Due to the widespread use of the Internet and new digital
technologies in everyday life, security in computer applications
and networks has never been such a hot topic. Digital
rights management, e-voting security, anonymity protection,
and denial of service are examples of security concerns that
have appeared over the last decade. The tools on which this
security is based include hash functions, pseudo-random number
generators, cryptosystems, and digital watermarking schemes.
Due to their wide use in security protocols, these tools are
targeted every day by hackers, and new threats are frequently
revealed. For example, security flaws have recently been
identified in SHA-1, the previous standard for hash functions
[1]. As the new standards (SHA-2 variants) are algorithmically
similar to SHA-1, stronger hash functions using new concepts
are desired.

New approaches based on chaos are frequently proposed
as an alternative to address concerns that recur in the field
of computer security [2]-[4]. The advantage of using chaotic
dynamics for security problems lies in their unpredictability,
which is established by the mathematical theory of
chaos. This theory brings many qualitative and quantitative
tools, namely ergodicity, entropy, expansivity, and sensitive
dependence on initial conditions [5]. These tools allow the
study of the randomness of the disorder generated by the
considered system [6].

Recently, many researchers have built chaotic neural networks
in order to use them as components of newly proposed hash
functions [7], pseudo-random number generators, cryptosystems
[8], [9], and digital watermarking schemes. Since their
introduction by McCulloch and Pitts in 1943, artificial
neural networks have been shown to be efficient non-linear
statistical data modeling tools which can implement complex
mapping functions. Hence, they may be trained to learn a
chaotic process and also, by construction, exhibit suitable
properties: data confusion and diffusion, one-way function and
compression. Security is not the only application domain of
such new tools: the existence of chaos in the brain has been
recently revealed, and the use of a chaotic artificial neural
network as a model can help, for example, neuroscientists in
their attempts to understand how the brain works.

However, using an element of chaos as a component of
the scheme is not sufficient, in our opinion, to claim that
the whole process behaves chaotically. We believe
that this claim is not self-evident and must be proven. Let us
note that up to now the proposed chaotic neural networks
have failed to convince the mathematics community due to
a lack of proof. This is why this paper explains
how to build an artificial neural network that
behaves chaotically, as defined by Devaney [10]. We will
establish a correspondence between particular neural networks
and chaotic iterations, which leads to the definition of the first
artificial neural network proven chaotic, according to Devaney.

The remainder of this paper is organized as follows. The
next section recalls basic notions on chaotic iterations and
Devaney's chaos, followed by a brief description of artificial
neural networks (ANNs). Section III presents a review of
some works related to chaotic neural networks. Our approach,
which consists in building a global recurrent ANN whose
iterations are chaotic, is formalized and discussed in Section IV.
Concrete examples of chaotic neural networks also show the
relevance of our method. Finally, in Section V we conclude
and outline future work.

II. Basic Recalls 

In the sequel $S^n$ denotes the $n$-th term of a sequence $S$ and
$E_i$ denotes the $i$-th component of a vector $E$. $f^k = f \circ \dots \circ f$
stands for the $k$-th composition of a function $f$. Finally, the following
notation is used: $\llbracket 1; N \rrbracket = \{1, 2, \dots, N\}$.

A. Chaotic iterations versus Devaney's chaos 

1) Chaotic Iterations: Let us consider a system with a finite
number $N \in \mathbb{N}^*$ of elements (or cells), so that each cell has
a boolean state. A sequence of length $N$ of boolean states of
the cells corresponds to a particular state of the system. A
sequence whose elements belong to $\llbracket 1; N \rrbracket$ is called a strategy.
The set of all strategies is denoted by $\mathbb{S}$.



Definition 1 The set $\mathbb{B}$ denoting $\{0, 1\}$, let $f : \mathbb{B}^N \to \mathbb{B}^N$
be a function and $S \in \mathbb{S}$ be a strategy. The so-called chaotic
iterations are defined by $x^0 \in \mathbb{B}^N$ and

$$\forall n \in \mathbb{N}^*, \forall i \in \llbracket 1; N \rrbracket, \quad x_i^n = \begin{cases} x_i^{n-1} & \text{if } S^n \neq i \\ \left(f(x^{n-1})\right)_{S^n} & \text{if } S^n = i. \end{cases}$$

In other words, at the $n$-th iteration, only the $S^n$-th cell
is "iterated". Note that in a more general formulation, $S^n$
can be a subset of components and $(f(x^{n-1}))_{S^n}$ can be
replaced by $(f(x^k))_{S^n}$, where $k < n$, describing for example
transmission delays [11]. Finally, let us remark that the term
"chaotic", in the name of these iterations, has a priori no link
with the mathematical theory of chaos, recalled below.
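As an illustration of Definition 1, the following is a minimal Python sketch of chaotic iterations. The names `chaotic_iterations` and `negation` are illustrative and not taken from the paper; cells are stored 0-indexed while strategy terms stay in 1..N, and only a finite prefix of the strategy is used.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, ...]


def chaotic_iterations(f: Callable[[State], State],
                       x0: Sequence[int],
                       strategy: Sequence[int]) -> List[State]:
    """Return [x^0, x^1, ..., x^n] where strategy = [S^1, ..., S^n].

    At step n only the cell numbered S^n (1-based) is replaced by the
    corresponding component of f(x^{n-1}); all other cells are kept.
    """
    states = [tuple(x0)]
    for s in strategy:
        x = states[-1]
        fx = f(x)
        nxt = list(x)
        nxt[s - 1] = fx[s - 1]  # update the single selected cell
        states.append(tuple(nxt))
    return states


def negation(x: State) -> State:
    """Vectorial negation: every boolean component is flipped."""
    return tuple(1 - b for b in x)


trajectory = chaotic_iterations(negation, (0, 0, 0), [1, 3, 1, 2])
for n, state in enumerate(trajectory):
    print(n, state)
```

With the vectorial negation, each step simply flips the cell selected by the strategy, which makes the trajectory easy to follow by hand.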

2) Devaney's chaotic dynamical systems: Consider a topological
space $(\mathcal{X}, \tau)$ and a continuous function $f$ on $\mathcal{X}$.

Definition 2 $f$ is said to be topologically transitive if, for
any pair of open sets $U, V \subset \mathcal{X}$, there exists $k > 0$ such that

$$f^k(U) \cap V \neq \emptyset.$$

Definition 3 An element (a point) $x$ is a periodic element
(point) for $f$ of period $n \in \mathbb{N}^*$, if $f^n(x) = x$.

Definition 4 $f$ is said to be regular on $(\mathcal{X}, \tau)$ if the set of
periodic points for $f$ is dense in $\mathcal{X}$: for any point $x$ in $\mathcal{X}$, any
neighborhood of $x$ contains at least one periodic point.

Definition 5 $f$ is said to be chaotic on $(\mathcal{X}, \tau)$ if $f$ is regular
and topologically transitive.

The chaos property is strongly linked to the notion of
"sensitivity", defined on a metric space $(\mathcal{X}, d)$ by:

Definition 6 $f$ has sensitive dependence on initial conditions
if there exists $\delta > 0$ such that, for any $x \in \mathcal{X}$ and any
neighborhood $V$ of $x$, there exist $y \in V$ and $n \geq 0$ such
that $d(f^n(x), f^n(y)) > \delta$.
$\delta$ is called the constant of sensitivity of $f$.

Indeed, Banks et al. have proven in [12] that when $f$ is
chaotic and $(\mathcal{X}, d)$ is a metric space, then $f$ has the property
of sensitive dependence on initial conditions (this property
was formerly an element of the definition of chaos). To sum
up, quoting Devaney in [10], a chaotic dynamical system
"is unpredictable because of the sensitive dependence on
initial conditions. It cannot be broken down or simplified into
two subsystems which do not interact because of topological
transitivity. And in the midst of this random behavior, we
nevertheless have an element of regularity". Fundamentally
different behaviors are consequently possible and occur in an
unpredictable way.

3) Chaotic iterations and Devaney's chaos: In this section
we outline the proofs of the properties on which our study
of chaotic neural networks is based. The complete theoretical
framework is detailed in [13].

Denote by $\Delta$ the discrete boolean metric, $\Delta(x, y) = 0 \Leftrightarrow x = y$.
Given a function $f : \mathbb{B}^N \to \mathbb{B}^N$, define the function
$F_f : \llbracket 1; N \rrbracket \times \mathbb{B}^N \to \mathbb{B}^N$ such that

$$F_f(k, E) = \left( E_j \cdot \Delta(k, j) + f(E)_k \cdot \overline{\Delta(k, j)} \right)_{j \in \llbracket 1; N \rrbracket},$$



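where $+$ and $\cdot$ are the boolean addition and product operations,
and $\overline{x}$ stands for the negation of $x$.

Consider the phase space $\mathcal{X} = \llbracket 1; N \rrbracket^{\mathbb{N}} \times \mathbb{B}^N$ and the map

$$G_f(S, E) = \left( \sigma(S), F_f(i(S), E) \right),$$

where the shift function is defined by $\sigma : (S^n)_{n \in \mathbb{N}} \in \mathbb{S} \mapsto
(S^{n+1})_{n \in \mathbb{N}} \in \mathbb{S}$, and the initial function $i$ is the map which
associates to a sequence its first term: $i : (S^n)_{n \in \mathbb{N}} \in \mathbb{S} \mapsto
S^0 \in \llbracket 1; N \rrbracket$.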

Thus chaotic iterations can be described by the following
iterations [13]:

$$\begin{cases} X^0 \in \mathcal{X} \\ X^{k+1} = G_f(X^k). \end{cases}$$
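The maps $F_f$ and $G_f$ can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: `make_F`, `G`, and `negation` are hypothetical names, and the infinite strategy is represented by a finite prefix (a tuple), which is an assumption made for the sake of a runnable example.

```python
from typing import Callable, Tuple

State = Tuple[int, ...]
Strategy = Tuple[int, ...]


def make_F(f: Callable[[State], State]) -> Callable[[int, State], State]:
    """Build F_f from f: component k (1-based) takes f(E)_k, others keep E_j."""
    def F(k: int, E: State) -> State:
        fE = f(E)
        return tuple(fE[j] if j + 1 == k else E[j] for j in range(len(E)))
    return F


def G(F: Callable[[int, State], State],
      S: Strategy, E: State) -> Tuple[Strategy, State]:
    """One application of G_f: shift the strategy and update E with its first term."""
    return S[1:], F(S[0], E)


def negation(x: State) -> State:
    return tuple(1 - b for b in x)


F0 = make_F(negation)
S, E = (2, 1, 2), (0, 0)
while S:  # iterate G until the finite strategy prefix is exhausted
    S, E = G(F0, S, E)
    print(S, E)
```

Each call to `G` consumes the first term of the strategy, mirroring the roles of the initial function $i$ and the shift $\sigma$.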

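Let us define a new distance between two points
$(S, E), (\check{S}, \check{E}) \in \mathcal{X}$ by

$$d((S, E); (\check{S}, \check{E})) = d_e(E, \check{E}) + d_s(S, \check{S}),$$

where

$$d_e(E, \check{E}) = \sum_{k=1}^{N} \Delta(E_k, \check{E}_k) \in [0; N], \qquad d_s(S, \check{S}) = \frac{9}{N} \sum_{k=1}^{\infty} \frac{|S^k - \check{S}^k|}{10^k} \in [0; 1].$$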

This new distance has been introduced in [13] to satisfy
the following requirements. When the number of different
cells between two systems increases, their distance
should increase too. In addition, if two systems present the
same cells and their respective strategies start with the same
terms, then the distance between these two points must be
small because the evolution of the two systems will be the
same for a while. The distance presented above follows these
recommendations. Indeed, if the floor value $\lfloor d(X, Y) \rfloor$ is equal
to $n$, then the systems $E, \check{E}$ differ in $n$ cells. In addition,
$d(X, Y) - \lfloor d(X, Y) \rfloor$ is a measure of the differences between
strategies $S$ and $\check{S}$. More precisely, this fractional part is less
than $10^{-k}$ if and only if the first $k$ terms of the two strategies
are equal. Moreover, if the $k$-th digit is nonzero, then the $k$-th
terms of the two strategies are different.
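The behavior of this distance can be checked numerically. The sketch below assumes that the strategy part is $d_s(S, \check{S}) = \frac{9}{N} \sum_{k \geq 1} |S^k - \check{S}^k| / 10^k$ and truncates the sum to finite prefixes of equal length; the function names are illustrative, and exact rationals are used to avoid rounding noise.

```python
from fractions import Fraction
from typing import Sequence


def d_e(E1: Sequence[int], E2: Sequence[int]) -> int:
    """Number of cells in which the two boolean states differ."""
    return sum(a != b for a, b in zip(E1, E2))


def d_s(S1: Sequence[int], S2: Sequence[int], N: int) -> Fraction:
    """Strategy distance truncated to the given prefixes (assumed formula)."""
    total = sum(Fraction(abs(a - b), 10 ** k)
                for k, (a, b) in enumerate(zip(S1, S2), start=1))
    return Fraction(9, N) * total


def distance(S1, E1, S2, E2) -> Fraction:
    return d_e(E1, E2) + d_s(S1, S2, len(E1))


# Two points whose states differ in 2 cells and whose strategies
# agree on their first two terms only.
d = distance((1, 2, 3, 1), (0, 1, 1), (1, 2, 1, 1), (1, 1, 0))
print(float(d))                    # integer part: differing cells
print(d - int(d) < Fraction(1, 100))  # fractional part below 10^-2
```

In this example the integer part counts the differing cells, and the fractional part is driven by the first position at which the strategies disagree.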

It is proven in [13], by using sequential continuity, that
the vectorial negation $f_0(x_1, \dots, x_N) = (\overline{x_1}, \dots, \overline{x_N})$ satisfies
the following proposition:

Proposition 1 $G_{f_0}$ is a continuous function on $(\mathcal{X}, d)$.

It is then checked, also in [13], that in the metric space
$(\mathcal{X}, d)$, the vectorial negation fulfills the three conditions for
Devaney's chaos: regularity, transitivity, and sensitivity. This
has led to the following result.

Proposition 2 $G_{f_0}$ is a chaotic map on $(\mathcal{X}, d)$ in the sense
of Devaney.

B. Neural Networks 

An artificial neural network is a set of simple processing
elements called neurons that are interconnected, usually with
a layer structure. It takes some input values and produces some
output ones.

Fig. 1. Description of a neuron

Like a biological neural network, the connections
between neurons influence the outputs given by the artificial
network. Thanks to a training process, an ANN is able to learn
complex relationships between inputs and outputs. A neuron $j$
computes an output $y = \varphi(x, w)$, where $\varphi()$ is the activation
function, $x$ is the input vector, and $w$ is the parameter vector. $w$
can be used to parameterize $\varphi$ or the neuron inputs. In the
latter case the connections are weighted, and a
component of the vector $w$ is then referred to as a synaptic weight.
Figure 1 describes a neuron $j$ with weighted connections. Its
output $y_j$ satisfies:

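$$y_j = \varphi\left( \sum_{i=1}^{n} w_{ij} x_i + b_j \right) = \varphi\left( \sum_{i=0}^{n} w_{ij} x_i \right) \qquad (1)$$

where $x_0 = -1$, $x = (x_1, \dots, x_n)$, and $w_{0j} = -b_j$ defines
the bias value.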
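A single neuron of this kind can be sketched as follows. The sketch assumes the convention stated above (an extra input $x_0 = -1$ with weight $w_{0j} = -b_j$, so that the bias is added to the weighted sum) and uses a sigmoid as an example activation; `neuron_output` and `sigmoid` are illustrative names.

```python
import math
from typing import Callable, Sequence


def sigmoid(a: float) -> float:
    """Example activation function (an assumption for this sketch)."""
    return 1.0 / (1.0 + math.exp(-a))


def neuron_output(x: Sequence[float], w: Sequence[float], b: float,
                  phi: Callable[[float], float] = sigmoid) -> float:
    """Output of one neuron with weighted inputs and a bias.

    The bias is folded into the weights through the extra input
    x_0 = -1 carrying the weight w_0 = -b.
    """
    x_ext = (-1.0,) + tuple(x)
    w_ext = (-b,) + tuple(w)
    activation = sum(wi * xi for wi, xi in zip(w_ext, x_ext))
    return phi(activation)


print(neuron_output((1, 0), (2, 3), -2))                    # sigmoid of 0
print(neuron_output((1, 1), (1, 1), 0.5, phi=lambda a: a))  # linear neuron
```

Folding the bias into the weight vector lets the whole neuron be written as one dot product followed by the activation.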

Neural networks have a layered architecture, but they may
differ in the way the output of a neuron affects the neuron itself.
In fact, based on the connection graph, two kinds of networks
can be distinguished: those having at least one loop and those
without any. A neural network which exhibits a loop is
called a feedback (or recurrent) network, whereas a network
belonging to the second class is said to be feed-forward. Obviously,
a feedback network can be seen as a dynamical system. In the
following sections, we use a recurrent version of the multi-layer
perceptron (MLP), a well-known ANN architecture for
which the universal approximation property has been proven
in the feed-forward context [14]. Typically, an MLP consists
of a layer of input neurons, one or more layers of hidden
neurons, and a layer of output neurons. Since an input neuron
is simply used as a channel to dispatch an input to each
neuron of the first hidden layer, we will not further consider
the input layer. Usually, the neurons of a given layer have
similar characteristics and each one is fully connected to the
next layer. Finally, it can be noticed that the number of input
and output neurons is completely specified by the considered
problem, while the number of hidden neurons depends directly
on the complexity of the relationships to be learned by the
ANN.

As said previously, a neural network is designed to model
relationships between inputs and outputs. In order to find a
proper modeling, an ANN must be trained so that it provides
the desired set of output vectors. The training (or learning)
process consists mainly in feeding the network with some
input vectors and updating the neuron parameters (weights
and bias value) using a learning rule and some information
which reflects the quality of the current modeling. When
the expected output vectors ($D_k$) are known in advance, the
quality can be expressed through the Mean-Squared Error [15]:

$$MSE = \frac{1}{N} \sum_{k=1}^{N} (T_k - D_k)^2 \qquad (2)$$

where $N$ is the number of input-output vector pairs used to
train the ANN (the pair set is called the training or learning
set) and $T_k$ denotes an output vector produced by the output
layer for a given input vector $X_k$. Consequently, in that case
the training process, which is said to be supervised, amounts to an
optimization algorithm that seeks the weights and biases
minimizing the MSE. Various optimization techniques exist;
they have given rise to distinct training algorithms performing
iterative parameter updates. Gradient-based methods are
particularly popular due to the backpropagation algorithm, but
they are sensitive to local minima. Heuristics like simulated
annealing or differential evolution can find a global
minimum, but they converge slowly. To control the
training process, two stopping criteria are most commonly used:
either the number of iterations, also called epochs, reaches an
upper bound, or the MSE goes below a threshold value.
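The error measure and the two stopping criteria can be sketched as follows. This is a schematic illustration only: the squared difference of two vectors is taken to be the squared Euclidean norm (an assumption), and `step` is a hypothetical callback that performs one training epoch and returns the current MSE.

```python
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]


def mse(outputs: Sequence[Vector], targets: Sequence[Vector]) -> float:
    """Mean over the training pairs of the squared output/target difference."""
    n = len(outputs)
    return sum(sum((t - d) ** 2 for t, d in zip(T, D))
               for T, D in zip(outputs, targets)) / n


def train(step: Callable[[], float], max_epochs: int,
          threshold: float) -> Tuple[int, float, bool]:
    """Run epochs until the MSE drops below threshold or max_epochs is hit.

    Returns (epochs used, last error, whether the threshold was reached).
    """
    err = float("inf")
    for epoch in range(1, max_epochs + 1):
        err = step()
        if err < threshold:
            return epoch, err, True
    return max_epochs, err, False


print(mse([(1, 0), (0, 0)], [(1, 1), (0, 0)]))

# A fake optimizer whose error decreases at each epoch.
errors = iter([0.5, 0.1, 0.001])
print(train(lambda: next(errors), max_epochs=1000, threshold=0.01))
```

Any real optimizer can be plugged in through `step`; the loop only encodes when training stops.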

III. Related Work 

Neuroscientists have long discussed the existence of chaos
in the brain. In the context of artificial neural networks, this
interest has given rise to various works studying the modeling
of chaos in neurons. The chaotic neuron model designed by
Aihara et al. [16] is widely used to build chaotic neural
networks. For example, a feedback ANN architecture is proposed
in [17] which consists of two layers (apart from the
input layer), one of them composed of chaotic neurons.
In their experiments, the authors showed that without any
input sequence the activation of each chaotic neuron results
in a positive average Lyapunov exponent, which indicates truly
chaotic behavior. When an input sequence is given iteratively
to the network, the chaotic neurons reach stabilized periodic
orbits with different periods, and thus potentially provide a
recognition state. Similarly, the same authors have recently
introduced another model of chaotic neuron: the non-linear
dynamic state (NDS) neuron, and used it to build a neural
network which is able to recognize learned stabilized periodic
orbits identifying patterns [18].

Today, another field of research in which chaotic neural
networks have received a lot of attention is data security.
In fact, chaotic cryptosystems are an appealing alternative to
classical ones due to properties such as sensitivity to initial
conditions or topological transitivity. Thus chaotic ANNs have
been considered to build ciphering methods, hash functions,
digital watermarking schemes, pseudo-random number generators,
etc. In [8] such a cipher scheme based on the dynamics
of Chua's circuit is proposed. More precisely, a feed-forward
MLP with two hidden layers is built to learn about 1500
input-output vector pairs, where each pair is obtained from the three
nonlinear ordinary differential equations modeling the circuit.
Hence, the proposed chaotic neural network is a network
which is trained to learn a true chaotic physical system. In
the cipher scheme the ANN plays the role of a chaos generator
with which the plain-text will be merged. Untrained neural
networks have also been considered to define block ciphering
[9] or hash functions [7]. The underlying idea is to exploit the
inherent properties of the ANN architecture, such as diffusion
and confusion.

Fig. 2. Example of global recurrent neural network modeling function $F_{f_0}$
such that $x^{n+1} = (x_1^{n+1}, x_2^{n+1}) = F_{f_0}(i(S^n), (x_1^n, x_2^n))$

IV. A First Recurrent Neural Network 
Chaotic According to Devaney 

A. Defining a First Chaotic Recurrent Neural Network 

We will now explain how to build a chaotic neural network 
using chaotic iterations. 

Let us reconsider the vectorial negation function denoted by
$f_0 : \mathbb{B}^N \to \mathbb{B}^N$ and its associated map $F_{f_0} : \llbracket 1; N \rrbracket \times \mathbb{B}^N \to
\mathbb{B}^N$. Firstly, it is possible to define an MLP which recognizes
$F_{f_0}$. That is, for all $(k, x) \in \llbracket 1; N \rrbracket \times \mathbb{B}^N$, the response of
the output layer to the input $(k, x)$ is $F_{f_0}(k, x)$. Secondly,
the output layer can be connected to the input layer as
depicted in Figure 2, leading to a global recurrent neural
network working as follows:

• At the initialization stage, the ANN receives a boolean
vector $x^0 \in \mathbb{B}^N$ as input state, and $S^0 \in \llbracket 1; N \rrbracket$ in its
input integer channel $i()$. Thus, $x^1 = F_{f_0}(S^0, x^0) \in \mathbb{B}^N$
is computed by the neural network.

• This state $x^1$ is published as an output. Additionally, $x^1$
is sent back to the input layer, to act as boolean state in
the next iteration.

• At iteration number $n$, the recurrent neural network
receives the state $x^n \in \mathbb{B}^N$ from its output layer and
$i(S^n) \in \llbracket 1; N \rrbracket$ from its input integer channel $i()$. It can
thus calculate $x^{n+1} = F_{f_0}(i(S^n), x^n) \in \mathbb{B}^N$, which will
be the new output of the network.

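In this way, if the initial state $x^0 \in \mathbb{B}^N$ is sent to the
network with a sequence $S \in \llbracket 1; N \rrbracket^{\mathbb{N}}$ applied in the input
integer channel $i()$, then the sequence $(x^n)_{n \in \mathbb{N}^*}$ of the outputs
is exactly the same as the sequence obtained from the
following chaotic iterations: $x^0 \in \mathbb{B}^N$ and

$$\forall n \in \mathbb{N}^*, \forall i \in \llbracket 1; N \rrbracket, \quad x_i^n = \begin{cases} x_i^{n-1} & \text{if } S^n \neq i \\ \overline{x_i^{n-1}} & \text{if } S^n = i. \end{cases}$$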

From a mathematical viewpoint, the MLP defined in this
subsection and the chaotic iterations recalled above have the
same behavior. In particular, given the same input vector
$(x^0, (S^n)_{n \in \mathbb{N}})$, they produce the same output vector
$(x^n)_{n \in \mathbb{N}^*}$: they are two equivalent reformulations of the
iterations of $G_{f_0}$ in $\mathcal{X}$. As a consequence, the behavior of our
MLP faithfully reflects the behavior of $G_{f_0}$, which is chaotic
according to Devaney.
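The equivalence described above can be illustrated with a small hand-wired network. The sketch below is not the trained MLP of the paper: it assumes a one-hot encoding of the integer input $k$ and threshold (step) activations, and wires one hidden layer so that each output equals $x_j$ XOR $[k = j]$, which is exactly $F_{f_0}$. All names are illustrative.

```python
from itertools import product
from typing import List, Sequence, Tuple

State = Tuple[int, ...]


def step(a: float) -> int:
    """Threshold activation (an assumption; the paper uses sigmoids)."""
    return 1 if a > 0 else 0


def mlp_F_f0(k: int, x: Sequence[int]) -> State:
    """Hand-wired MLP computing F_{f_0}(k, x) with one hidden layer."""
    e = [1 if j + 1 == k else 0 for j in range(len(x))]  # one-hot code of k
    out = []
    for xj, ej in zip(x, e):
        h_or = step(xj + ej - 0.5)    # hidden neuron: xj OR ej
        h_and = step(xj + ej - 1.5)   # hidden neuron: xj AND ej
        out.append(step(h_or - h_and - 0.5))  # output neuron: xj XOR ej
    return tuple(out)


def run_recurrent(x0: Sequence[int], strategy: Sequence[int]) -> List[State]:
    """Global feedback loop: each output is fed back as the next input state."""
    outputs, x = [], tuple(x0)
    for s in strategy:
        x = mlp_F_f0(s, x)
        outputs.append(x)
    return outputs


def direct_iterations(x0: Sequence[int], strategy: Sequence[int]) -> List[State]:
    """Chaotic iterations of the vectorial negation, computed directly."""
    outputs, x = [], list(x0)
    for s in strategy:
        x[s - 1] = 1 - x[s - 1]
        outputs.append(tuple(x))
    return outputs


print(run_recurrent((0, 1), [1, 2, 2, 1]))
print(run_recurrent((0, 1), [1, 2, 2, 1]) == direct_iterations((0, 1), [1, 2, 2, 1]))
```

Because the network reproduces $F_{f_0}$ exactly on every input, the recurrent loop and the direct iterations generate identical output sequences for any initial state and strategy prefix.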

B. Improving the Variety of Chaotic Recurrent Neural Networks

The approach proposed to build chaotic neural networks,
explained in the previous subsection, is not restricted to an
ad hoc function $f_0 : \mathbb{B}^N \to \mathbb{B}^N$; it can be generalized as
follows. The function $F_{f_0}$ associated to the vectorial negation
$f_0$, which has been recognized by the neural network, can be
replaced by any function $F_f : \llbracket 1; N \rrbracket \times \mathbb{B}^N \to \mathbb{B}^N$ such that
the chaotic iterations $G_f$ are chaotic, as defined by Devaney.

To be able to define functions that can be used in this
situation, we must first introduce the graph of iterations of
a given function $f : \mathbb{B}^N \to \mathbb{B}^N$, $x \mapsto (f_1(x), \dots, f_N(x))$.

Let a configuration $x$ be given. In what follows the configuration
$N(i, x) = (x_1, \dots, \overline{x_i}, \dots, x_N)$ is obtained by switching
the $i$-th component of $x$. Intuitively, $x$ and $N(i, x)$ are
neighbors. The chaotic iterations of the function $f$ can be
represented by the graph $\Gamma(f)$ defined below.

Definition 7 (Graph of iterations) In the oriented graph of
iterations $\Gamma(f)$, vertices are configurations of $\mathbb{B}^N$ and there is
an arc labeled $i$ from $x$ to $N(i, x)$ iff $F_f(i, x)$ is $N(i, x)$.

We have proven in [5] that:

Theorem 1 Functions $f : \mathbb{B}^N \to \mathbb{B}^N$ such that $G_f$ is chaotic
according to Devaney are functions such that the graph $\Gamma(f)$
is strongly connected.

Since it is easy to check whether a graph is strongly
connected, we can use this theorem to discover new functions
$f : \mathbb{B}^N \to \mathbb{B}^N$ such that the neural network associated to $G_f$
behaves chaotically, as defined by Devaney.
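The criterion of Theorem 1 can be checked mechanically. The following sketch builds the graph of iterations of Definition 7 by enumerating all configurations and tests strong connectivity with two breadth-first searches (forward and on the reversed graph). The helper names are illustrative, and `only_first` is a hypothetical counterexample added here for contrast.

```python
from collections import deque
from itertools import product
from typing import Callable, Dict, List, Set, Tuple

State = Tuple[int, ...]
Graph = Dict[State, List[State]]


def iteration_graph(f: Callable[[State], State], N: int) -> Graph:
    """Arc x -> N(i, x) exists iff updating cell i with f actually flips it."""
    graph: Graph = {}
    for x in product((0, 1), repeat=N):
        fx = f(x)
        graph[x] = [x[:i] + (1 - x[i],) + x[i + 1:]
                    for i in range(N) if fx[i] != x[i]]
    return graph


def reachable(graph: Graph, start: State) -> Set[State]:
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def is_strongly_connected(graph: Graph) -> bool:
    start = next(iter(graph))
    reverse: Graph = {u: [] for u in graph}
    for u, successors in graph.items():
        for v in successors:
            reverse[v].append(u)
    return len(reachable(graph, start)) == len(graph) == len(reachable(reverse, start))


def negation(x: State) -> State:          # vectorial negation
    return tuple(1 - b for b in x)


def shift_negate(x: State) -> State:      # (not x1, x1, x2, ..., x_{N-1})
    return (1 - x[0],) + x[:-1]


def only_first(x: State) -> State:        # hypothetical: flips x1 only
    return (1 - x[0],) + x[1:]


for name, f in [("negation", negation), ("shift_negate", shift_negate),
                ("only_first", only_first)]:
    print(name, is_strongly_connected(iteration_graph(f, 3)))
```

The enumeration visits all $2^N$ configurations, so this brute-force check is only practical for small $N$.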

C. The Discovery of New Chaotic Neural Networks 

Considering Theorem 1, it is easy to check that
$f_1(x_1, \dots, x_N) = (\overline{x_1}, x_1, x_2, \dots, x_{N-1})$ is such that $G_{f_1}$
behaves chaotically, as defined by Devaney. Consequently, we
can now obtain two chaotic neural networks by learning either
$F_{f_0}$ or $F_{f_1}$.

To support our approach, a set of illustrative examples
composed of five neural networks is given. The first three
networks are respectively defined by:

• $f_{0,1}(x_1, x_2, x_3, x_4) = (\overline{x_1}, \overline{x_2}, \overline{x_3}, \overline{x_4})$,



TABLE I
Outline of the results of learning several iteration functions
using different recurrent MLP architectures

One hidden layer
            8 neurons                     10 neurons
Function    Mean epoch    Success rate    Mean epoch    Success rate
f_{0,2}     82.21         100%            73.44         100%
f_{1,1}     76.88         100%            59.84         100%
g_1         36.24         100%            37.04         100%

Two hidden layers: 8 and 4 neurons
Function    Mean epoch number    Success rate
f_{0,2}     203.68               76%
f_{1,1}     135.54               96%
g_1         76.56                100%

while the last ones are defined by:

• $g_0(x_1, x_2, x_3) = (x_1, x_2, x_3)$,

• $g_1(x_1, x_2, x_3) = (\overline{x_1}, x_2, x_3)$.

Due to Theorem 1, the ANNs associated to $f_{0,1}$, $f_{0,2}$ and $f_{1,1}$
behave chaotically, as defined by Devaney. This is not
the case for the networks based on the boolean functions $g_0$
and $g_1$, since $\Gamma(g_0)$ and $\Gamma(g_1)$ are not strongly connected.

D. Experimental results 

Among the five neural networks mentioned in the previous
subsection, we decided to study the training process of
three of them. For each neural network
we have considered MLP architectures with one and two
hidden layers, with, in the first case, different numbers of
hidden neurons (sigmoidal activation). Thus we have
different versions of a neural network modeling the same
iteration function. Only the size and number of hidden
layers may change, since the numbers of input and output
neurons (linear activation) are fully specified by the function.
The neural networks are trained using the quasi-Newton
L-BFGS (Limited-memory Broyden-Fletcher-Goldfarb-Shanno)
algorithm in combination with the Wolfe line
search. The training is performed until the learning error
(MSE) is lower than a chosen threshold value (10^^).
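To make concrete what such a network has to learn, the sketch below enumerates input-output pairs for a given iteration function. It assumes that the training set is the exhaustive list of pairs $((k, x), F_f(k, x))$ with $k$ fed as a single integer input, which the text does not state explicitly; `training_set` and `negation` are illustrative names.

```python
from itertools import product
from typing import Callable, List, Tuple

State = Tuple[int, ...]
Pair = Tuple[Tuple[int, ...], State]


def training_set(f: Callable[[State], State], N: int) -> List[Pair]:
    """All pairs ((k, x_1, ..., x_N), F_f(k, x)) for k in 1..N and x in B^N."""
    pairs: List[Pair] = []
    for k in range(1, N + 1):
        for x in product((0, 1), repeat=N):
            fx = f(x)
            # F_f(k, x): only component k is replaced by f(x)_k
            target = tuple(fx[j] if j + 1 == k else x[j] for j in range(N))
            pairs.append(((k,) + x, target))
    return pairs


def negation(x: State) -> State:
    return tuple(1 - b for b in x)


pairs = training_set(negation, 4)
print(len(pairs))                       # N * 2^N pairs
print(len(pairs[0][0]), len(pairs[0][1]))  # input and output sizes
```

Under these assumptions a function on $\mathbb{B}^N$ fixes $N + 1$ input values and $N$ output values per pair, and the set contains $N \cdot 2^N$ pairs.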

Table I gives, for each considered neural network, the mean
number of epochs needed to train it and a success rate
which reflects a successful training in less than 1000 epochs.
Both values are computed over 25 trainings with random
initialization of weights and biases. These results highlight
several points. Firstly, various MLP architectures can learn
the same iteration function, with obviously a best suited one (a
hidden layer composed of ten sigmoidal neurons). In particular,
the structure with two hidden layers seems to be too complex for
the functions to be learned. Secondly, training networks so that
they behave chaotically seems to be more difficult, since they
need on average more epochs to be correctly trained. However,
the relevance of this point needs to be further investigated.
Similarly, there may be a link between the training difficulty
and the disorder (evaluation of their constants of sensitivity,
expansivity, etc.) induced by a chaotic iteration function.

V. Conclusion and future work 

Many chaotic neural networks have been developed for
different fields of application, in particular for data security
purposes, where they are used to define ciphering methods, hash
functions and so on. Unfortunately, the proposed networks are
usually claimed to be chaotic without any proof. In this paper
we have presented a rigorous mathematical framework which
allows us to construct artificial networks proven chaotic,
according to Devaney. More precisely, a correspondence between
chaotic iterations, which are a particular case of topological
chaos in the sense of Devaney, and MLP neural networks with
a global feedback is established. In fact, we have shown
that an iteration function is chaotic if its graph of iterations
is strongly connected (a property easily checked), and that
a global recurrent MLP can learn such a function. Future
research will study more carefully the performance of the
training process and alternative neural network architectures.